Targeting a light-weight and multi-channel approach for distributed stream processing

Abstract

Processing high-throughput data-streams has become a major challenge in areas such as real-time event monitoring, complex dataflow processing, and big data analytics. While there has been tremendous progress in distributed stream processing systems in the past few years, the high-throughput and low-latency (a.k.a. high sustainable-throughput) requirement of modern applications is pushing the limits of traditional data processing infrastructures. This paper introduces a new distributed stream processing engine (DSPE), called Asynchronous Iterative Routing (or simply “AIR”), which implements a light-weight, dynamic sharding protocol. AIR expedites direct and asynchronous communication among all the worker nodes via a channel-like communication protocol on top of the Message Passing Interface (MPI), thereby completely avoiding the need for a dedicated driver node. The system adopts a new progress-tracking protocol, called hew-meld, which is experimentally observed to show low latency in our master-less architecture when compared to the conventional low-watermark technique. The current version of AIR is also equipped with two fault tolerance and recovery strategies, namely checkpointing & rollback and replication. With its unique design, AIR scales out particularly well to multi-core HPC architectures; specifically, we deployed it on clusters with up to 16 nodes and 448 cores (thus reaching a peak of 435.3 million events and 55.14 GB of data processed per second), which we found to significantly outperform existing DSPEs.
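
The abstract does not describe how AIR's sharding protocol works internally. Purely as an illustration of the general idea of key-based sharding without a driver node, the Python sketch below lets each worker compute an event's destination rank locally from the event key. The function names `shard_for` and `route` are hypothetical, and static hash partitioning is used here as a simple stand-in; it is not AIR's dynamic protocol and involves no MPI calls.

```python
import hashlib
from collections import defaultdict


def shard_for(key: str, num_workers: int) -> int:
    """Map a key to a worker rank (hypothetical helper).

    A cryptographic digest is used instead of Python's built-in hash(),
    because hash() on strings is salted per process and would give
    different answers on different workers.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_workers


def route(events, num_workers):
    """Group (key, value) events by destination rank.

    Every worker can run this locally on its own input, so no central
    driver is needed to decide where an event goes.
    """
    outboxes = defaultdict(list)
    for key, value in events:
        outboxes[shard_for(key, num_workers)].append((key, value))
    return dict(outboxes)


if __name__ == "__main__":
    sample = [("sensor-a", 1), ("sensor-b", 2), ("sensor-a", 3)]
    for rank, batch in sorted(route(sample, num_workers=4).items()):
        print(rank, batch)
```

Because every worker applies the same deterministic function, all events with the same key arrive at the same rank regardless of which worker first received them. In a real deployment each outbox would then be sent to its rank over the network.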

Similar resources

Multi-Stream Front-End Processing for Robust Distributed Speech Recognition

This paper investigates a multi-stream-based front-end in Distributed Speech Recognition (DSR). It aims at improving the performance of Hidden Markov Model (HMM)-based systems by combining features based on conventional MFCCs and formant-like features to constitute a new multivariate feature vector. The approach presented in this paper constitutes an alternative to the DSR-XAFE (XAFE: eXtended ...

Distributed Reactive Stream Processing

The reactive programming paradigm successfully overcomes the limitations of the observer pattern, which has traditionally been used for developing event-driven distributed systems. Due to its declarative style, compositionality and automatic management of dependencies, reactive programming offers a promising new way for building complex distributed data-flow systems. This article outlines some open chal...

Scalable Distributed Stream Processing

Many stream-based applications are naturally distributed. Applications are often embedded in an environment with numerous connected computing devices with heterogeneous capabilities. As data travels from its point of origin (e.g., sensors) downstream to applications, it passes through many computing devices, each of which is a potential target of computation. Furthermore, to cope with time-vary...

A benchmarking approach to optimal asset allocation for insurers and pension funds

Uncertainty in the financial market will be driven by underlying Brownian motions, while the assets are assumed to be general stochastic processes adapted to the filtration of the Brownian motions. The goal of this study is to calculate the accumulated wealth in order to optimize the expected terminal value using a suitable utility function. This thesis introduced the Lim-Wong’s benchmark fun...

Scalable Planning for Distributed Stream Processing Systems

Recently, the problem of automatic composition of workflows has been receiving increasing interest. Initial investigation has shown that designing a practical and scalable composition algorithm for this problem is hard. A very general computational model of a workflow (e.g., BPEL) can be Turing-complete, which precludes fully automatic analysis of compositions. However, in many applications, work...

Journal

Journal title: Journal of Parallel and Distributed Computing

Year: 2022

ISSN: 1096-0848, 0743-7315

DOI: https://doi.org/10.1016/j.jpdc.2022.04.022